ABSTRACT


The Center for Information Technology (CIT) of the University of South Carolina (USC)
participated  in  the  Autonomous  Negotiating  Teams  (ANTS)  program  funded  by  DARPA 
since  its  inception.  The  USC  team  was  involved  in  two  efforts.  The  first  was  to  develop  a 
resource  allocation  architecture  called  TargetShare,  in  which  agents  used  utility  to  negotiate 
and allocate resources. That effort was led by Dr. Jose Vidal. The second effort, led by Dr.
Juan  E.  Vargas,  was  focused  on  providing  a  multi-target  tracking  system,  called  the 
SCTracker,  compatible  with  the  program’s  challenge  problem  infrastructure.  This  report 
describes  the  two  efforts. 
A. TARGETSHARE


1.  INTRODUCTION 


TargetShare  is  an  architecture  based  on  a  method  for  solving  service  allocation  problems  in 
which  a  set  of  services  must  be  allocated  to  a  set  of  agents  so  as  to  maximize  a  global  utility. 
The  method  is  completely  distributed  so  it  can  scale  to  any  number  of  services  without 
degradation.  In  this  report,  we  formalize  the  service  allocation  problem  and  then  present  a 
simple  hill-climbing,  a  global  hill  climbing,  and  a  bidding-protocol  algorithm  for  solving  it. 
We  then  analyze  the  expected  performance  of  these  algorithms  as  a  function  of  various 
problem  parameters  such  as  the  branching  factor  and  the  number  of  agents.  Finally,  we  use 
the  sensor  allocation  problem,  an  instance  of  a  service  allocation  problem,  to  show  the 
bidding protocol at work. The simulations also show that a phase transition in the expected
quality of the solution exists as the amount of communication between agents increases.

The  problem  of  dynamically  allocating  services  to  a  changing  set  of  consumers  arises  in 
many  applications.  For  example,  in  an  e-commerce  system,  the  service  providers  are  always 
trying  to  determine  which  service  to  provide  to  whom,  and  at  what  price  [5];  in  an 
automated  manufacturing  for  mass  customization  scenario,  agents  must  decide  which 
services  will  be  more  popular/profitable  [1];  and  in  a  dynamic  sensor  allocation  problem,  a 
set  of  sensors  in  a  field  must  decide  which  area  to  cover,  if  any,  while  preserving  their 
resources. 

While  these  problems  might  not  seem  related,  they  are  instances  of  a  more  general  service 
allocation  problem  in  which  a  finite  set  of  resources  can  be  allocated  by  a  set  of  autonomous 
agents  so  as  to  maximize  some  global  measure  of  utility.  A  general  approach  to  solving  these 
types  of  problems  has  been  used  in  many  successful  systems,  such  as  [2]  [3]  [11]  [9].  The 
approach  involves  three  general  steps:  1.  Assign  each  resource  that  needs  to  be  preserved  to 
an  agent  responsible  for  managing  the  resource.  2.  Assign  each  goal  of  the  problem  domain 
to  an  agent  responsible  for  achieving  it.  Achieving  these  goals  requires  the  consumption  of 
resources.  3.  Have  each  agent  take  actions  so  as  to  maximize  its  own  utility,  but  implement  a 
coordination  algorithm  that  encourages  agents  to  take  actions  that  also  maximize  the  global 
utility.  In  this  report  we  formalize  this  general  approach  by  casting  the  problem  as  a  search  in 
a  global  fitness  landscape  which  is  defined  as  the  sum  of  the  agents’  utilities.  We  show  how 
the choice of a coordination/communication protocol disseminates information, which in
turn "smooths" the global utility landscape. This smooth global utility landscape allows the
agents  to  easily  find  the  global  optimum  by  making  selfish  decisions  to  maximize  their 
individual  utility.  We  also  present  experiments  that  pinpoint  the  location  of  a  phase 
transition  in  the  time  it  takes  for  the  agents  to  find  the  optimal  allocation.  The  transition  can 
be  seen  when  the  amount  of  communication  allowed  among  agents  is  manipulated.  It  exists 
because  communication  allows  the  agents  to  align  their  individual  landscapes  with  the  global 
landscape.  At  some  amount  of  communication,  the  alignment  between  these  landscapes  is 
good  enough  to  allow  the  agents  to  find  the  global  optimum,  but  less  communication  drives 
the  agents  into  a  random  behavior  from  which  the  system  cannot  recuperate. 
1.1 Task Allocation


The  service  allocation  problem  we  discuss  is  a  superset  of  the  well  known  task  allocation 
problem.  A  task  allocation  problem  is  defined  by  a  set  of  tasks  that  must  be  allocated  among 
a  set  of  agents.  Each  agent  has  a  cost  associated  with  each  subset  of  tasks,  which  represents 
the  cost  the  agent  would  incur  if  it  had  to  perform  those  tasks.  Coordination  protocols  are 
designed  to  allow  agents  to  trade  tasks  so  that  the  globally  optimal  allocation  (the  one  that 
minimizes  the  sum  of  all  the  individual  agent  costs)  is  reached  as  soon  as  possible.  It  has 
been shown that this globally optimal allocation can be reached if the agents use the
contract-net protocol [9] with Original, Cluster, Swap, and Multiagent (OCSM) contracts [8]. These OCSM
contracts  make  it  possible  for  the  system  to  transition  from  any  allocation  to  any  other 
allocation  in  one  step.  As  such,  a  simple  hill-climbing  search  is  guaranteed  to  eventually 
reach the global optimum. We consider the service allocation problem, which is a superset
of the task allocation problem because it allows more than one agent to service a "task". The
service allocation problem we study also has the characteristic that not every allocation can be
reached from every other allocation in one step.

1.2 Service Allocation


In a service allocation problem there is a set of services, offered by service agents, and a set
of  consumers  who  use  those  services.  A  server  can  provide  any  one  of  a  number  of  services 
and  some  consumers  will  benefit  from  that  service  without  depleting  it.  A  server  agent  incurs 
a  cost  when  providing  a  service  and  can  choose  not  to  provide  any  service.  For  example,  a 
server  could  be  an  agent  that  sets  up  a  website  with  information  about  cats.  All  the 
consumer  agents  with  interests  in  cats  will  benefit  from  this  service,  but  those  with  other 
interests  will  not  benefit.  Since  each  server  can  provide,  at  most,  one  service,  the  problem  is 
to  find  the  allocation  of  services  that  maximizes  the  sum  of  all  the  agents’  utilities,  that  is,  an 
allocation  that  maximizes  the  global  utility. 

1.2.1  Sensor  Allocation 

Another  instance  of  the  service  allocation  problem  is  the  sensor  allocation  problem,  which 
we  will  use  as  an  example  throughout  this  report.  In  the  sensor  allocation  problem  we  have 
a  number  of  sensors  placed  in  fixed  positions  in  a  two-dimensional  space.  Each  sensor  has  a 
limited viewing angle and distance but can point in any one of a number of directions. For
example, a sensor might have a viewing angle of 120 degrees, viewing distance of 3 feet, and
be able to look in three directions, each one 120 degrees apart from the others. That is, it can
"look" in any one of three directions. In each direction it can see everything that is in the
120-degree, 3-foot-long view cone. Each time a sensor looks in a particular direction it
uses energy. There are also targets that move around in the field. The goal is for the sensors
to detect and track all the targets in the field. However, in order to determine the location of
a  target,  two  or  more  sensors  have  to  look  at  it  at  the  same  time.  We  also  wish  to  minimize 
the  amount  of  energy  spent  by  the  sensors. 

We consider the sensor agents as being able to provide three services, one for each sector,
but only one at a time. We consider the target agents as consuming the services of the
sensors.

2.  A  FORMAL  MODEL  FOR  SERVICE  ALLOCATION 

We define a service allocation problem SA as a pair SA = {C, S}, where C is the set of
consumer agents, $C = \{c_1, c_2, \ldots, c_{|C|}\}$, and each $c_i$ has only one possible state, $c_i = 0$.
The set of service agents is $S = \{s_1, s_2, \ldots, s_{|S|}\}$, and the value of $s_i$ is the value of that
service. For the sensor domain, in which a sensor can observe any one of three 120-degree sectors or be
turned off, we have $s_i \in \{0, 1, 2, \text{off}\}$. An allocation is an assignment of states to the services
(since the consumers have only one possible state we can ignore them). A particular
allocation is denoted by $a = \{s_1, s_2, \ldots, s_{|S|}\}$, where the $s_i$ have some value taken from the
domain of service states, and $a \in A$, where $A$ is the set of all possible allocations. That is, an
allocation tells us the state of all agents (since consumers have only one state they can be
omitted). Each agent also has a utility function. The utility that an agent receives depends on
the current allocation a, where we let a(s) be the state of service agent s under a. The agents'
utilities will depend on their state and the state of other agents. For example, in the sensor
problem we define the utility of sensor s as $U_s(a)$, where


$$U_s(a) = \begin{cases} 0 & \text{if } a(s) = \text{off} \\ -K_1 & \text{otherwise} \end{cases}$$ [eq 1]

That is, a sensor receives no utility when it is off and must pay a penalty of $K_1$ when it is
running. The targets are the consumers, and each target's utility is defined as

$$U_c(a) = \begin{cases} 0 & \text{if } f_c(a) = 0 \\ K_2 & \text{if } f_c(a) = 1 \\ K_2 + n - 2 & \text{if } f_c(a) = n \end{cases}$$ [eq 2]

where $f_c(a)$ = number of sensors s that see c given their state a(s). [eq 3]

Finally,  given  the  individual  agent  utilities,  we  define  the  global  utility 
GU(a)  as  the  sum  of  the  individual  agents’  utilities: 


$$GU(a) = \sum_{c \in C} U_c(a) + \sum_{s \in S} U_s(a)$$ [eq 4]


The service allocation problem is to find the allocation a that maximizes GU(a). In the
sensor problem, there are $4^{|S|}$ possible allocations, which would make a simple
generate-and-test approach take exponential amounts of time. We wish to find the global optimum
much faster than that.
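
The definitions in equations 1 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constants `K1` and `K2`, the `coverage` table (which targets each sensor sees in each sector), and all function names are hypothetical and do not come from the TargetShare implementation. Equation 2 is coded as printed above.

```python
from itertools import product

K1 = 0.5   # assumed running cost of a sensor (equation 1)
K2 = 1.0   # assumed utility a target gets from a single observer (equation 2)
OFF = "off"
STATES = (0, 1, 2, OFF)


def sensor_utility(state):
    """Equation 1: zero when the sensor is off, -K1 otherwise."""
    return 0.0 if state == OFF else -K1


def observers(target, allocation, coverage):
    """Equation 3: number of sensors whose current sector contains the target."""
    return sum(
        1
        for sensor, state in enumerate(allocation)
        if state != OFF and target in coverage[sensor][state]
    )


def target_utility(target, allocation, coverage):
    """Equation 2 as printed: 0, K2, or K2 + n - 2 for n observers."""
    n = observers(target, allocation, coverage)
    if n == 0:
        return 0.0
    if n == 1:
        return K2
    return K2 + n - 2


def global_utility(allocation, coverage, targets):
    """Equation 4: sum of all target utilities and all sensor utilities."""
    return sum(target_utility(c, allocation, coverage) for c in targets) + sum(
        sensor_utility(state) for state in allocation
    )


def brute_force_optimum(coverage, targets):
    """Generate-and-test over all 4^|S| allocations (exponential in |S|)."""
    return max(
        product(STATES, repeat=len(coverage)),
        key=lambda a: global_utility(a, coverage, targets),
    )


# Made-up example: coverage[sensor][sector] is the set of targets visible there.
coverage = [
    {0: {"t1"}, 1: set(), 2: {"t2"}},
    {0: {"t1"}, 1: {"t2"}, 2: set()},
]
targets = ["t1", "t2"]
best = brute_force_optimum(coverage, targets)
```

The exhaustive search is included only to make the size of the search space concrete; it is the approach the rest of this section tries to avoid.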


2.1 Search Algorithms

Our  goal  is  to  design  an  interaction  protocol  whereby  an  allocation  a  that  maximizes  the 
global  utility  GU(a)  is  reached  in  a  small  number  of  steps.  In  each  step  of  our  protocol 
one of the agents will change its state or send a message to another agent. The messages
might contain the state or utilities of other agents. We assume that the agents do not have
direct access to the other agents' states or utility values. A simple algorithm could involve
having each consumer, at each time step, change the state of a randomly chosen service agent
so as to increase the consumer's own utility. That is, a consumer c will change the current
allocation a into a' by changing the state of some sensor s such that $U_c(a') > U_c(a)$. If the
sensor's state cannot be changed so as to increase the utility, then the consumer does
nothing. In the sensor domain this amounts to a target picking a sensor and changing its
state so that the sensor can see the target. We refer to this algorithm as individual
hill-climbing. The individual hill-climbing algorithm is simple to implement and the only
communication needed is between the consumer and the chosen server. This simple
algorithm lets every consumer agent increase its individual utility at each turn. However,
the new allocation a' might result in a lower global utility, since a' might reduce the utility of
several other agents. Therefore, it does not guarantee that an optimal allocation will
eventually be reached. Another approach is for each agent to change state so as to increase the
global utility. We call this a global hill-climbing algorithm.
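
One turn of individual hill-climbing can be sketched as follows. The function name, the tuple representation of an allocation, and the choice to take the best improving state for the chosen agent (the text only requires some improving state) are assumptions made for this illustration.

```python
import random


def individual_hill_climb_step(allocation, consumer_utility, states, rng):
    """One turn of a consumer: re-point one randomly chosen service agent.

    `allocation` is a tuple of service-agent states and `consumer_utility`
    maps an allocation to this consumer's own utility. The allocation is
    returned unchanged when no state of the chosen agent improves it.
    """
    i = rng.randrange(len(allocation))
    best, best_u = allocation, consumer_utility(allocation)
    for state in states:
        candidate = allocation[:i] + (state,) + allocation[i + 1:]
        u = consumer_utility(candidate)
        if u > best_u:  # strict improvement of the consumer's own utility
            best, best_u = candidate, u
    return best


# Toy consumer that is only visible in sector 0 of every sensor.
def toy_utility(allocation):
    return sum(1 for state in allocation if state == 0)


rng = random.Random(0)
after_one_turn = individual_hill_climb_step((1, 1), toy_utility, (0, 1, 2, "off"), rng)
```

Note that the step never consults any other agent's utility, which is why it can lower the global utility.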

In  order  to  implement  this  algorithm,  an  agent  would  need  to  know  how  the  proposed  state 
change  affects  the  global  utility  as  well  as  the  states  of  all  the  other  agents.  That  is,  it  would 
need to be able to determine GU(a') for the new allocation a', which requires it to know the state of all the agents in a'
as  well  as  the  utility  functions  of  every  other  agent,  as  per  the  definition  of  global  utility  in 
equation  4.  In  order  for  an  agent  to  know  the  state  of  others,  it  would  need  to  somehow 
communicate  with  all  other  agents.  If  the  system  implements  a  global  broadcasting  method 
then  we  would  need  for  each  agent  to  broadcast  its  state  at  each  time.  If  the  system  uses 
more  specialized  communications  such  as  point-to-point,  limited  broadcasting,  etc.,  then 
more  messages  will  be  needed.  Any  protocol  that  implements  the  global  hill-climbing 
algorithm  will  reach  a  locally  optimal  allocation  in  the  global  utility.  This  is  because  it  is 
always true that, for a new allocation a' and old allocation a, $GU(a') \geq GU(a)$. Whether or not
this  local  optimum  is  also  a  global  optimum  will  depend  on  the  ruggedness  of  the  global 
utility  landscape.  That  is,  if  it  consists  of  one  smooth  peak  then  it  is  likely  that  any  local 
optimum  is  the  global  optimum.  On  the  other  hand,  if  the  landscape  is  very  rugged  then 
there  are  likely  many  local  peaks. 
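
A centralized sketch of global hill-climbing is given below. It assumes a single process can evaluate the global utility `gu` directly, which sidesteps the communication cost discussed above; the function names and the toy utility are hypothetical.

```python
import random


def neighbors(allocation, states):
    """All allocations that differ from `allocation` in exactly one agent's state."""
    for i, current in enumerate(allocation):
        for state in states:
            if state != current:
                yield allocation[:i] + (state,) + allocation[i + 1:]


def global_hill_climb(start, states, gu, rng=None, max_steps=10_000):
    """Move to a randomly chosen improving neighbor until none exists."""
    rng = rng or random.Random(0)
    current = tuple(start)
    for _ in range(max_steps):
        better = [n for n in neighbors(current, states) if gu(n) > gu(current)]
        if not better:
            return current  # no neighbor has a higher global utility
        current = rng.choice(better)
    return current


def is_local_optimum(allocation, states, gu):
    """True when no single-agent change increases the global utility."""
    return all(gu(allocation) >= gu(n) for n in neighbors(allocation, states))


# Toy, single-peaked global utility used only to exercise the search.
def toy_gu(allocation):
    return -sum((x - 1) ** 2 for x in allocation)


toy_states = (0, 1, 2, 3)
result = global_hill_climb((3, 3, 0), toy_states, toy_gu)
```

Because every accepted move strictly increases `gu`, the loop cannot cycle, and it stops at a local optimum of whatever landscape `gu` defines.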

Studies in NK landscapes [4] tell us that smoother landscapes result when an agent's utility
depends on the state of a smaller number of other agents. Global hill-climbing is better than
individual hill-climbing since it guarantees that we will find a local optimum. However, it
requires agents to know each other's utility functions and to constantly communicate their
state. Such a large amount of communication is often undesirable in multi-agent systems. We
need a better way to find the global optimum.

One  way  of  correlating  the  individual  landscapes  to  the  global  utility  landscape  is  with  the 
use  of  a  bidding  protocol  in  which  each  consumer  agent  tells  each  service  the  marginal  utility 
the  consumer  would  receive  if  the  service  switched  its  state  so  as  to  maximize  the 
consumer’s  utility.  The  service  agent  can  then  choose  to  provide  the  service  with  the  highest 
aggregate  demand.  Since  the  service  is  picking  the  value  that  maximizes  the  utility  of 
everyone  involved  (all  the  consumers  and  the  service)  without  decreasing  the  utility  of 
anyone else (the other services), this protocol is guaranteed to never decrease the global
utility.  This  bidding  protocol  is  a  simplified  version  of  the  contract-net  [9]  protocol  in  that  it 
does  not  require  contractors  to  send  requests  for  bids.  However,  in  order  for  a  consumer  to 
determine  the  marginal  utility  it  will  receive  from  one  sensor  changing  state,  it  still  needs  to 
know  the  state  of  all  the  other  sensors.  This  means  that  a  complete  implementation  of  this 
protocol will still require a lot of communication (namely, the same amount as in global
hill-climbing). We can reduce this number of messages by allowing agents to communicate with
only  a  subset  of  the  other  agents  and  making  their  decisions  based  on  only  this  subset  of 
information.  That  is,  instead  of  all  services  telling  each  consumer  their  state,  a  consumer 
could  receive  state  information  from  only  a  subset  of  the  services  and  make  its  decision 
based  on  this  (assuming  that  the  services  chosen  are  representative  of  the  whole).  This 
strategy shows a lot of promise but its performance can only be evaluated on an
instance-by-instance basis. We explore this strategy experimentally in Section 3 using the sensor domain.
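
The reduced-communication variant of the bidding protocol can be sketched as a single synchronous round. Several simplifications here are assumptions made for illustration: positions are 2-D points, the viewing distance is ignored, every bid has the same fixed amount, a sensor that receives no bids stays off, and the function names are hypothetical.

```python
import math
from collections import Counter


def sector_of(sensor_pos, target_pos):
    """Index (0, 1, 2) of the 120-degree sector of the sensor that faces the target."""
    dx = target_pos[0] - sensor_pos[0]
    dy = target_pos[1] - sensor_pos[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return min(int(angle // 120), 2)  # guard against a rounded-up 360.0


def run_bidding_round(sensors, targets, k, bid=1.0):
    """Each target bids to its k closest sensors; each sensor picks its best sector."""
    demand = [Counter() for _ in sensors]
    for t in targets:
        closest = sorted(
            range(len(sensors)), key=lambda i: math.dist(sensors[i], t)
        )[:k]
        for i in closest:
            demand[i][sector_of(sensors[i], t)] += bid
    # Aggregate demand decides the sector; no bids means the sensor stays off.
    return tuple(d.most_common(1)[0][0] if d else "off" for d in demand)


sensors = [(0.0, 0.0), (10.0, 0.0)]
targets = [(1.0, 1.0), (2.0, 1.0), (9.0, -1.0)]
allocation_k1 = run_bidding_round(sensors, targets, k=1)
allocation_k2 = run_bidding_round(sensors, targets, k=2)
```

The parameter `k` plays the role of the size of the subset of services each consumer talks to, so varying it changes the amount of communication per round.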

2.1.1  Theoretical  Time  Bounds  of  Global  Hill-Climbing 

We now know that global hill-climbing will always reach a local optimum. The next questions
we must answer are:

1. How many local optima are there?

2.  What  is  the  probability  that  a  local  optimum  is  the  global  optimum? 

3.  How  long  does  it  take,  on  average,  to  reach  a  local  optimum? 

Let a be the current allocation and a' be a neighboring allocation. We know that a is a local
optimum if

$$\forall\, a' \in N(a):\ GU(a) > GU(a')$$ [eq 5]

where

$$N(a) = \{\, x \mid x \text{ is a Neighbor of } a \,\}$$ [eq 6]

We  define  a  Neighbor  allocation  as  an  allocation  where  one,  and  only  one,  agent  has  a 
different  state. 

The probability that some allocation a is a local optimum is simply the probability that
equation 5 is true. If the utilities of all pairs of neighbors are not correlated, then this probability
is

$$\Pr[\forall\, a' \in N(a):\ GU(a) > GU(a')] = \Pr[GU(a) > GU(a')]^{b}$$ [eq 7]

where b is the branching factor. In the sensor problem $b = 3 \cdot |S|$, where S is the set of all
sensors. That is, since each sensor can be in any of four states, it has three neighboring states
from each state. In some systems it is safe to assume that the global utilities of a's neighbors
are independent. However, most systems show some degree of correlation.
Now we need to calculate $\Pr[GU(a) > GU(a')]$, that is, the probability that some
allocation a has a greater global utility than its neighbor a', for all a and a'. This could be
calculated  via  an  exhaustive  enumeration  of  all  possible  allocations.  However,  often  we  can 
find  the  expected  value  of  this  probability.  For  example,  in  the  sensor  problem  each  sensor 
has  four  possible  states.  If  a  sensor  changes  its  state  from  sector  x  to  sector  j  the  utility  of 
the  target  agents  covered  by  x  will  decrease  while  the  utility  of  those  in  j  will  increase.  If  we 
assume  that,  on  average,  the  targets  are  evenly  spaced  on  the  field,  then  the  global  utilities 
for  both  of  these  are  expected  to  be  the  same.  That  is,  the  expected  probability  that  the 
global  utility  of  one  allocation  is  bigger  than  the  other  is  1/2.  If,  on  the  other  hand,  a  sensor 
changes state from "off" to a sector, or from a sector to "off," the global utility is expected
to decrease and increase, respectively. However, there are an equal number of opportunities
to go from "off" to "on" and vice versa. Therefore, we can also expect that for these cases
the  probability  that  the  global  utility  of  one  allocation  is  bigger  than  the  other  is  1/2.  Based 
on  these  approximations,  we  can  declare  that  for  the  sensor  problem 

$$\Pr[\forall\, a' \in N(a):\ GU(a) > GU(a')] = \frac{1}{2^{b}}$$ [eq 8]

If  we  assume  an  even  distribution  of  local  optima,  the  total  number  of  local  optima  is  simply 
the  product  of  the  total  number  of  allocations  times  the  probability  that  each  one  is  a  local 
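optimum. That is,

$$\text{Total number of local optima} = \frac{1}{2^{b}}\,|A|$$ [eq 9]

For the sensor problem, $b = 3 \cdot |S|$ and $|A| = 4^{|S|}$, so by equation 9 the expected number of
local optima is $4^{|S|} / 2^{3|S|}$, and

$$\Pr[\text{a local optimum is also global}] = \text{[right-hand side illegible in the source]}$$ [eq 10]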


We  can  find  the  expected  time  the  algorithm  will  take  to  reach  a  local  optimum  by 
determining  the  maximum  number  of  steps  from  every  allocation  to  the  nearest  local 
optimum.  This  gives  us  an  upper  bound  on  the  number  of  steps  needed  to  reach  the  nearest 
local  optimum  using  global  hill-climbing.  Notice  that,  under  either  individual  hill-climbing  or 
the  bidding  protocol  it  is  possible  that  the  local  optimum  is  not  reached,  or  is  reached  after 
more  steps,  since  these  algorithms  can  take  steps  that  lower  the  global  utility.  In  order  to 
find  the  expected  number  of  steps  to  reach  a  local  optimum,  we  start  at  any  one  of  the  local 
optima and then traverse all possible links at each depth d until all possible allocations have
been  visited.  This  occurs  when 

$$\frac{1}{2^{b}}\,|A| \cdot b^{d} > |A|$$ [eq 11]

Solving for d (dividing both sides by $|A|$ gives $b^{d} > 2^{b}$, and we then take the base-b logarithm), we get

$$d > b \log_{b} 2$$ [eq 12]


The expected worst-case distance from any point to the nearest local optimum is, therefore,
$b \log_b 2$ (this number only makes sense for $b \geq 2$ since smaller numbers of neighbors do not
form a searchable space). That is, the number of steps to reach the nearest local optimum in
the sensor domain grows almost linearly with the branching factor $b = 3 \cdot |S|$, since
$b \log_b 2 = b / \log_2 b$. We
can expect search time to increase roughly linearly with the number of sensors in the field.
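
To get a feel for the magnitudes, the quantities in equations 8, 9 and 12 can be evaluated for a field of seven sensors, the size used in the simulations. The function names below are hypothetical, and the numbers inherit the independence assumption behind equation 7.

```python
import math


def local_optimum_probability(b):
    """Equation 8: probability that an allocation beats all b of its neighbors."""
    return 0.5 ** b


def expected_local_optima(n_allocations, b):
    """Equation 9: expected number of local optima among n_allocations."""
    return n_allocations * local_optimum_probability(b)


def worst_case_depth(b):
    """Equation 12: the bound b * log_b(2) on steps to the nearest local optimum."""
    return b * math.log(2, b)


n_sensors = 7
b = 3 * n_sensors             # branching factor of the sensor problem
n_allocations = 4 ** n_sensors

depth_bound = worst_case_depth(b)
optima = expected_local_optima(n_allocations, b)
```

For seven sensors the bound on the depth comes out at roughly 4.8 steps.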


3.  SIMULATIONS 

While  the  theoretical  results  above  give  us  some  bounds  on  the  number  of  iterations  before 
the  system  is  expected  to  converge  to  a  local  optimum,  the  bounds  are  rather  loose  and  do 
not  tell  us  much  about  the  dynamics  of  the  executing  system.  Also,  we  cannot  show 
mathematically  how  changes  in  the  amount  of  communication  change  the  search.  Therefore, 
we  have  implemented  a  service  allocation  simulator  to  answer  these  questions.  It  simulates 
the  sensor  allocation  domain  described  in  the  introduction.  The  simulator  is  written  in  Java 
and  the  source  code  is  available  upon  request.  It  gathers  and  analyzes  data  from  any  desired 
number  of  runs.  The  program  can  analyze  the  behavior  of  any  number  of  target  and  sensor 
agents  on  a  two-dimensional  space,  and  the  agents  can  be  given  any  desired  utility  function. 
The  program  is  limited  to  static  targets.  That  is,  it  only  considers  the  one-shot  service 
allocation  problem. 

Each  new  allocation  is  completely  independent  of  any  previous  one.  In  the  tests  we 
performed,  each  run  has  seven  sensors  and  seven  targets,  all  of  which  are  randomly  placed 
on  a  two-dimensional  grid.  Each  sensor  can  only  point  in  one  of  three  directions  or  sectors. 
These  three  sectors  are  the  same  for  all  sensors  (specifically,  the  first  sector  is  from  0  to  120 
degrees,  the  second  one  from  120  to  240,  and  the  third  one  from  240  to  360).  All  the  sensors 
use  the  same  utility  function  which  is  given  by  equation  1,  while  the  targets  use  equation  2. 
After  a  sensor  agent  receives  all  the  bids  it  chooses  the  sector  that  has  the  highest  aggregate 
demand,  as  described  by  the  bidding  protocol.  During  a  run,  each  of  the  targets  periodically 
sends  a  bid  to  a  number  of  sensors  asking  them  to  turn  to  the  sector  that  faces  the  target. 
We  set  the  bid  amount  to  a  fixed  number  for  these  tests.  Periodically,  the  sensors  count  the 
number  of  bids  they  have  received  for  each  sector  and  turn  their  detector  (such  as  a  radar)  to 
face  the  sector  with  the  highest  aggregate  demand. 

We  assume  that  neither  the  targets  nor  the  sensors  can  form  coalitions.  We  vary  the  number 
of  sensors  to  which  the  targets  send  their  bids  in  order  to  explore  the  quality  of  the  solution 
that  the  system  converges  upon  as  the  amount  of  communication  changes.  For  example,  at 
one  extreme  if  all  the  targets  send  their  bids  to  all  the  sensors,  then  the  sensors  would  always 
set  their  sector  to  be  the  one  with  the  most  targets.  This  particular  service  allocation  should, 
usually,  be  the  best.  However,  it  might  not  always  be  the  optimal  solution.  For  example,  if 
seven  targets  are  clustered  together  and  the  eighth  is  on  another  part  of  the  field,  it  would  be 
better  if  six  sensor  agents  pointed  towards  the  cluster  of  targets  while  the  remaining  two 
sensor  agents  pointed  towards  the  stray  target  rather  than  having  all  sensor  agents  point 
towards  the  cluster  of  targets.  At  the  other  extreme,  if  all  the  targets  send  their  bids  to 
only one sensor, then they will minimize communications, but then the sensors will point to
the  sector  from  which  they  received  a  message,  i.e.,  an  allocation  which  is  likely  to  be 
suboptimal. 

These  simulations  explore  the  ruggedness  of  the  system’s  global  utility  landscape  and  the 
dynamics  of  the  agents’  exploration  of  this  landscape.  If  the  agents  were  to  always  converge 
on  a  local  (non-global)  optimum  then  we  would  deduce  that  this  problem  domain  has  a  very 
rugged  utility  landscape.  On  the  other  hand,  if  they  usually  manage  to  reach  the  global 
optimum  then  we  could  deduce  a  smooth  utility  landscape. 


4.  TEST  RESULTS 

In each of our tests we set the number of agents that each target will send its bid to, that is,
the  number  of  neighbors,  to  a  fixed  number.  Given  this  fixed  number  of  neighbors,  we  then 
generated  100  random  placements  of  agents  on  the  field  and  ran  our  bidding  algorithm  10 
times  on  each  of  those  placements.  Finally,  we  plotted  the  average  solution  quality,  over  the 
10  runs,  as  a  function  of  time  for  each  of  the  100  different  placements.  The  solution  quality 
is  given  by  the  ratio 


$$\frac{\text{CurrentUtility}}{\text{GlobalOptimalUtility}}$$ [eq 13]

When this ratio equals 1, the run has reached the global optimum. Since the number of agents is small, we
were able to calculate the global optimum using a brute-force method. Specifically, there are
$3^{7} = 2187$ possible configurations, which, multiplied by 100 random placements, gives 218700
combinations that we had to check for each run in order to find the global optimum using
brute force.
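
The brute-force check and the quality ratio of equation 13 can be sketched as below. The function names and the toy utility are assumptions for illustration; the sketch also assumes the optimal utility is positive so that the ratio is well defined.

```python
from itertools import product

SECTORS = (0, 1, 2)


def brute_force_best(n_sensors, gu, sectors=SECTORS):
    """Highest global utility over all len(sectors)**n_sensors configurations."""
    return max(gu(a) for a in product(sectors, repeat=n_sensors))


def solution_quality(current_utility, optimal_utility):
    """Equation 13: ratio of the current utility to the globally optimal one."""
    return current_utility / optimal_utility


# Number of sector configurations for seven sensors, counted explicitly.
n_configs = sum(1 for _ in product(SECTORS, repeat=7))


# Toy utility used only to exercise the search: count sensors facing sector 0.
def toy_gu(allocation):
    return sum(1 for state in allocation if state == 0)


optimum = brute_force_best(7, toy_gu)
quality = solution_quality(toy_gu((0, 0, 0, 1, 1, 2, 2)), optimum)
```

A run has reached the global optimum exactly when `solution_quality` returns 1.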

Due  to  the  large  number  of  configurations,  using  more  than  7  sensors  to  run  the  simulation 
was  not  practical.  Notice,  however,  that  our  algorithm  is  much  faster  than  this  brute-force 
search  which  we  perform  only  to  confirm  that  our  search  does  find  the  global  optimum.  In 
our  tests  there  were  always  seven  target  agents  and  seven  sensor  agents.  We  varied  the 
number of neighbors from 1 to 7. If each target can only communicate with one sensor,
the sensors will likely have very little information for making their decision, while if all
targets  communicate  with  all  seven  sensors,  then  each  sensor  will  generally  be  able  to  point 
to  the  sector  with  the  most  targets.  However,  because  these  decisions  are  made  in  an 
asynchronous  manner,  it  is  possible  that  some  sensors  may  not  always  receive  all  the  bids 
before  decisions  are  due.  The  targets  always  send  their  bids  to  the  sensors  that  are  closest  to 
them. 

The  results  from  our  experiments  are  shown  in  Figure  1  where  we  can  see  that  there  is  a 
transition  in  the  system’s  performance  as  the  number  of  neighbors  goes  from  three  to  five. 
That  is,  if  the  targets  only  send  their  bids  to  three  sensors  then  it  is  almost  certain  that  the 
system will stay in a configuration that has a very low global utility. However, if the targets
send their bids to five sensors, then it is almost guaranteed (98% of the time) that the
system will reach the globally optimal allocation.


5.  RELATED  WORK 

There is ongoing work in the field of complexity that attempts to study the dynamics of
complex  adaptive  systems  [4].  Our  approach  is  based  on  ideas  borrowed  from  the  use  of  NK 
landscapes  for  the  analysis  of  co-evolving  systems.  As  such,  we  are  using  some  of  the  results 
from  that  field.  However,  complexity  theory  is  more  concerned  with  explaining  the  dynamic 
behavior of existing systems, while we are more concerned with the engineering of
multi-agent systems for distributed service allocation. The Collective Intelligence (COIN)
framework [12] shares many of the goals of our research. Its authors start with a global
utility function from which they derive the reward functions for each agent. The agents are
assumed  to  use  some  form  of  reinforcement  learning.  They  show  that  the  global  utility  is 
maximized  when  using  their  prescribed  reward  functions.  They  do  not,  however,  consider 
how  agent  communication  might  affect  the  individual  agent’s  utility  landscape.  The  task 
allocation  problem  has  been  studied  in  [7],  but  the  service  allocation  problem  we  present  in 
this  paper  has  received  very  little  attention.  There  is  also  work  being  done  on  the  analysis  of 
the  dynamics  of  multi-agent  systems  for  other  domains  such  as  e-commerce  [5]  and 
automated  manufacturing  [6].  It  is  possible  that  extensions  to  our  approach  will  shed  some 
light on the dynamics of these domains.

6.  CONCLUSIONS 


We  have  formalized  the  service  allocation  problem  and  examined  a  general  approach  to 
solving problems of this type. The approach involves the use of utility-maximizing agents
that  represent  the  resources  and  the  services.  A  simple  form  of  bidding  is  used  for 
communication.  An  analysis  of  this  approach  reveals  that  it  implements  a  form  of  distributed 
hill-climbing,  where  each  agent  climbs  its  own  utility  landscape  and  not  the  global  utility 
landscape.  However,  we  showed  that  increasing  the  amount  of  communication  among  the 
agents  forces  each  individual  agent’s  landscape  to  become  increasingly  correlated  to  the 
global  landscape. 

These  theoretical  results  were  then  verified  in  our  implementation  of  a  sensor  allocation 
problem,  which  is  an  instance  of  a  service  allocation  problem.  Furthermore,  the  simulations 
allowed  us  to  determine  the  location  of  a  phase  transition  in  the  amount  of  communication 
needed  for  the  system  to  consistently  arrive  at  the  globally  optimal  service  allocation.  More 
generally,  we  have  shown  how  a  service  allocation  problem  can  be  viewed  as  a  distributed 
search  by  multiple  agents  over  multiple  landscapes.  We  also  showed  how  the  correlation 
between  the  global  utility  landscape  and  the  individual  agent’s  utility  landscape  depends  on 
the amount and type of inter-agent communication. Specifically, we have shown that
increased communication leads to a higher correlation between the global and individual
utility  landscapes,  which  increases  the  probability  that  the  global  optimum  will  be  reached. 
Of  course,  the  success  of  the  search  still  depends  on  the  connectivity  of  the  search  space, 
which  will  vary  from  domain  to  domain. 

We expect that our general approach can be applied to the design of any multi-agent system
whose  desired  behavior  is  given  by  a  global  utility  function  but  whose  agents  must  act 
selfishly.  Our  future  work  includes  the  study  of  how  the  system  will  behave  under 
perturbations.  For  example,  as  the  target  moves  it  perturbs  the  current  allocation  and  the 
global  optimum  might  change.  We  also  hope  to  characterize  the  local  to  global  utility 
function  correlation  for  different  service  allocation  problems  and  the  expected  time  to  find 
the  global  optimum  under  various  amounts  of  communication. 

B. SC TRACKER


1.  INTRODUCTION 

This  part  of  the  report  describes  a  multi-target  tracking  system  developed  to  support  the 
ANTS  challenge  problem  (CP).  The  tracking  system  is  based  on  a  Bayesian  approach  [18] 
that  estimates  target  locations  and  velocities  by  fusing  amplitude  and  frequency 
measurements obtained from Doppler radar sensors. The tracking system was designed to
operate in conjunction with the Doppler radar sensors developed by BAE Sanders [26], or
with their simulated version, RADSIM, developed at the Air Force Research Laboratory
[25]. Both the simulated and the hardware sensors return values of amplitude and
frequency. The amplitude depends on the distance between the sensor and the target,
and the frequency is proportional to the radial velocity of the target, as seen by the sensor.
Equations 4 and 5 describe the amplitude and frequency models; Figure 4 shows the
geometry used to determine amplitude and frequency given the target velocity, its
location, and the location of the sensor.

The  tracker  system  fuses  data  into  a  probability  space  to  obtain  an  estimate  of  a  target’s  state 
at any given time [15, 18, 21, 22, 23]. The state of a target, represented by its location and
velocity, is not directly observable from the sensor data. Instead, the sensors return
amplitude  and  frequency  measurements  that  are  related  to  the  target  location,  its  velocity, 
and  sensor  parameters,  including  sensor  location,  sector  selection,  gain,  type  of 
measurement,  beam  width,  number  of  pulses  within  a  measurement,  and  power. 

2.  SENSOR  MODEL 

In general, a target moving in a two-dimensional plane can be represented by a state vector
given by

$$X = [x,\; y,\; \dot{x},\; \dot{y}]^T$$  [eq 1]

where $x$ and $y$ are the target’s coordinates, and $\dot{x}$ and $\dot{y}$ are the target’s velocities in the x
and y directions, respectively, in a fixed reference frame. Given that the sensor returns a
measurement $m_i$, a sensor model for that measurement can be expressed as

$$m_i = M_i(X, \theta_i)$$  [eq 2]

The sensor model $M_i$ is a function of the target state $X$ and the parameters $\theta_i$ of the
model. The model could be based on fundamental principles of physics or be formulated
from empirical observations [20, 22, 24]. In either case it is likely that the functional
model $M_i$ will not explain the measurement $m_i$ completely. Errors may occur because
the model is insufficient and does not take into consideration all the possible
interactions within the system, or because the experimental conditions go beyond
control. Hence a better representation of the model is

$$m_i = M_i(X, \theta_i) + \varepsilon_i$$  [eq 3]

where $\varepsilon_i$ is the error term in the measurement of the sensor. During the
development of the tracking system two sensor models were used. The first sensor model
was given by the system of equations:


A  =  K  exp 


[eq4] 


ft  =  C 


where $A$ is a value of amplitude and $f$ is a value of frequency given the range (the
distance between the sensor and the target). K, C, and a are parameters of the sensor
model. This model had a correction factor given by equation 5:

4  „  =4(i+0.05N(0,i))+4  /(i+0.05N(o,i)) 

Jk.n  ^ 

^+^ffk 


i.e.,  the  value  of  amplitude  was  affected  by  Gaussian  noise  compounded  at  a  level  of  5%, 
while  the  value  of  frequency  had  an  error  factor  of  . 


The first sensor model was replaced by a model that contained a noise factor drawn from a
uniform distribution over [-0.2, 0.2]. The second model made amplitude a function of
frequency through an interpolation function $C_j(f)$.


^C.(/)exp 


a 


BW^ 


[l  +  t/(-0.2,0.2)] 


[eq6] 


3.  PROCESS  MODEL 

In  addition  to  the  sensor  model,  the  tracker  uses  a  process  model  to  predict  target  location 
and  velocity  [14,  16,  17,  18,  19,  25,  26].  The  process  model  contains  data  structures  and 
algorithms  that  were  specifically  designed  to  cope  with  the  asynchronous  nature  of  the 
CP and the incompatible measurement spaces of the data sources. The process model
relates the history of the target states to potential future states. Let $H(t)$ denote the history
of target states until time $t$. If $X(t)$ denotes the target state at time $t$, then

$$H(t) = [X(1), \ldots, X(t)]$$  [eq 7]

i.e., H(t) is the set of all target states until time t. A general motion model of the target’s
trajectory can be ascertained from the history of the target states. Given the current target
state and the distribution of the velocity obtained from that history, it is feasible to project
that information into the future and obtain a distribution of the most likely target locations.
This information can be used with the measurements obtained at time t+1 to locate the
target.


$$X(t+1) = f\big(H(t),\; m_1(t+1), \ldots, m_N(t+1)\big)$$  [eq 8]

where $m_1(t+1), \ldots, m_N(t+1)$ are the measurements from N sensors at time t+1.

The  process  model  assumes  that  the  clocks  on  all  the  sensors  are  synchronized.  However 
the  hardware  implementation  of  the  challenge  problem  is  not  compatible  with  this 
assumption.  One  of  the  constraints  of  the  CP  architecture  is  that  a  sensor  returns  one  single 
measurement  during  a  time  t.  Yet,  to  estimate  the  target  state  at  any  time,  it  is  necessary  to 
have  as  many  measurements  as  possible  during  that  period.  Another  constraint  is  that 
measurements  from  a  single  sensor  are  not  enough  to  track  a  target  with  a  high  degree  of 
accuracy. 

To  overcome  these  limitations,  the  process  model  assumes  that  the  motion  of  a  target 
follows  the  laws  of  inertia.  This  assumption  is  implemented  using  the  concept  of  motion 
through  time,  which  is  implemented  by  two  structures,  called  the  time  frames  and  the 
motion  model,  described  in  the  following  sections. 

where

V(t-1), V(t), V(t+1) are the estimates of target velocity,

L(t-1), L(t), L(t+1) are the estimates of target location,

A(s1,t) .. A(sn,t) are the amplitude measurements from sensors s1 .. sn at time t,
F(s1,t) .. F(sn,t) are the frequency measurements from sensors s1 .. sn at time t,
P(s1,t) .. P(sn,t) are the operational parameters of sensors s1 .. sn at time t.

Figure 1: Interactions Between the State and Process Models


4.  TIME  FRAMES 

Given  the  asynchronous  nature  of  the  CP  infrastructure,  information  from  the  sensors  is 
never  guaranteed  to  arrive  at  the  tracker  on  time.  In  fact,  data  arriving  with  significant  delay 
are the norm rather than the exception. To cope with this constraint, it was hypothesized
that changes in the target state are very small during a period $\Delta T_f$, and that all sensor
measurements received in that period can be fused to obtain an approximate estimate of that
target state. This assumption led to the discretization of time into a series of time frames of
length $\Delta T_f$. It is implied that the target location within a time frame is constant.

Theoretically the best results are obtained by making $\Delta T_f$ approach zero. However, it is not
feasible to obtain sufficient measurements to estimate target state under that condition.

The  tracker  uses  time  frames  to  accommodate  incoming  data  based  on  the  time  stamp  at 
which  data  was  produced  by  the  sensors.  These  data,  which  consist  of  amplitude  and 
frequency,  are  used  to  update  the  estimation  of  target  location  and  velocity  at  the  end  of 
each  time  frame.  Due  to  this  scheme,  data  belonging  to  the  past,  received  later  because  of 
communication  or  processor  delays,  is  not  ignored.  Instead,  those  data  are  used  to  update 
the  tracker  at  the  current  time  frame,  or  at  times  T-1 ,  T-2,  T-3,  etc.,  if  data  for  those 
frames arrive during the current time frame. Since measurements are continuously added
under a timed-queued scheme, whenever the tracker updates location and velocity for the
time-frame  at  time  T,  the  tracker  first  checks  if  there  is  any  data  for  previous  time  frames.  If 
new  data  indeed  came  for  previous  time  frames,  then  the  estimates  of  target  state  for  those 
frames  are  updated,  and  the  new  predictions  are  propagated  forward.  The  tracker  ignores 
measurements  whose  time  stamp  is  delayed  more  than  4  time  frames.  Although  this  number 
is  easily  set  up  as  an  operational  parameter,  the  number  is  consistent  with  theoretical  results 
from  hidden  Markov  models,  and  constitutes  a  good  compromise  that  has  little  impact  on 
performance  [18]. 
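
The late-data policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tracker's actual code: the class name `FrameQueue`, its method names, and the frame length of 1000 time units (borrowed from the Figure 2 example) are hypothetical. Only the four-frame cutoff comes from the text.

```python
from collections import defaultdict

FRAME_LENGTH = 1000   # assumed frame length, in the units of the time stamps
MAX_LATE_FRAMES = 4   # cutoff stated in the text; an operational parameter


def frame_index(timestamp, frame_length=FRAME_LENGTH):
    """Return the index of the time frame that contains `timestamp`."""
    return int(timestamp // frame_length)


class FrameQueue:
    """Groups measurements by the frame in which they were produced."""

    def __init__(self):
        self.frames = defaultdict(list)   # frame index -> list of measurements
        self.dirty = set()                # frames that received new data

    def add(self, timestamp, value, now):
        """File a measurement by its time stamp; return False if it is too old."""
        current = frame_index(now)
        produced = frame_index(timestamp)
        if current - produced > MAX_LATE_FRAMES:
            return False                  # delayed by more than 4 frames: ignored
        self.frames[produced].append((timestamp, value))
        self.dirty.add(produced)
        return True

    def frames_to_update(self):
        """Return the frames with new data, oldest first, and clear the markers."""
        pending = sorted(self.dirty)
        self.dirty.clear()
        return pending
```

With these assumed values, a measurement stamped t=1800 that arrives at time 4200 is three frames late and is still accepted, while a measurement that is five frames late is discarded. Returning the pending frames oldest first mirrors the requirement that earlier frames be re-estimated before the predictions are propagated forward.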

Figure 2 shows the sequence of data updates that would take place at frame 4000, as new
“old” data from previous frames are detected. The circles are color coded: green means
that the data item arrived on time, yellow that it arrived one time frame later, blue two time
frames later, and red three time frames later. The position of each circle within the time frame
indicates approximately the time at which the item arrived. For example, in time frame 1000, data
items were produced at the sensors at times 1000, 1250, 1500, and 1800. Of these, only the
first two items arrived on time; the data item for t=1500 arrived one frame later, and the data
item for t=1800 arrived three frames later. Similarly, the data item for t=2800 arrived
two frames later, etc.

Figure 2: Time Frames


In a given time frame some sensors may return more than one measurement. This is
illustrated in the last time frame, where Sensor 1 returns measurements at times t=4100 and
t=4800. Two approaches were taken to deal with this issue. The first approach considered
only the most recent measurement and ignored the previous measurements within the frame.
The second approach averaged all the measurements received from a sensor within the frame.
Each measurement has an intrinsic background noise that is not constant. If the
assumption is made that background noise fluctuates around zero, then averaging the
measurements diminishes the effect of fluctuations, and the result is a more accurate
representation of the target state. Therefore the approach adopted by the tracker is to
average the measurements within the same frame.
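
The averaging step can be illustrated as follows. This is a minimal sketch that assumes measurements within a frame are available as (sensor id, value) pairs; the function name `average_per_sensor` and this data layout are hypothetical and are not taken from the tracker.

```python
from collections import defaultdict


def average_per_sensor(measurements):
    """Average repeated readings from the same sensor within one time frame.

    `measurements` is a list of (sensor_id, value) pairs for a single frame.
    Returns a dict mapping each sensor id to the mean of its readings.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sensor_id, value in measurements:
        totals[sensor_id] += value
        counts[sensor_id] += 1
    return {sensor_id: totals[sensor_id] / counts[sensor_id]
            for sensor_id in totals}
```

If the background noise is zero-mean, as assumed in the text, the mean of several readings has a smaller variance than any single reading, which is the motivation for preferring this approach over keeping only the most recent measurement.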

Although  some  of  these  concepts  may  not  be  applicable  as-is  to  other  tracking  problems,  the 
time  frame  concept  offers  a  reasonable  insight  that  is  applicable  to  many  tracking  platforms. 


5.  LOCATION  MODEL 

The measurements obtained from the hardware sensors or from RADSIM are
stored in a time-stamped queue of time frames. Each time frame has a start time and an end
time.  The  end-time  of  frame  T-1  is  the  start  time  of  frame  T.  A  measurement  belongs  to  a 
time  frame  if  that  measurement’s  time  is  between  the  start  and  end  time  of  that  time  frame. 

The  tracker  uses  an  internal  clock  to  keep  track  of  time.  As  time  progresses,  the  clock 
reaches  a  point  at  which  the  tracker  time  is  greater  than  the  end  time  of  the  current  time 
frame. When that occurs, a method called updateTracker is invoked. The method first checks
if  new  data  has  just  arrived  for  previous  time  frames.  If  new  data  exist  in  those  frames  then 
the  frames  are  updated  before  updating  the  current  time  frame,  and  the  update  proceeds 
forward.  The  method  uses  two  data  handlers  to  manage  amplitude  and  frequency 
measurements  from  the  sensors.  The  amplitude  handler  uses  the  sensor  model  to  fuse 
information  and  obtain  a  probabilistic  distribution  of  the  target  location.  This  distribution  is 
updated  using  the  motion  model,  which  is  in  turn  an  expression  of  the  history  of  target 
states.  The  frequency  handler  uses  the  frequency  measurements  to  estimate  the  target’s 
velocity. 


6.  AMPLITUDE  HANDLER 

The two-dimensional target domain is divided into a set of m × n grid cells [13, 18]. The
measurements from the sensors are used to compute the probability of a target being in each
grid cell. Each grid cell is represented by a single point, the center of the cell.


Consider two amplitude measurements $A_1$ and $A_2$ obtained from two different sensors.
A distribution of target location over the cells of the grid can be obtained by fusing these
measurements. The probability of the target being in each cell $(x_i, y_j)$, where $i = 1 \ldots m$ and
$j = 1 \ldots n$, is computed given the measurements $A_1, A_2$. Let this probability be denoted
by $p(X = x_i, Y = y_j \mid M_1 = A_1, M_2 = A_2)$. The measurements $A_1, A_2$ are assumed to be independent;
thus measurements from one sensor do not have any effect on the measurements of another
sensor. This implies that

$$p(X = x_i, Y = y_j \mid M_1 = A_1, M_2 = A_2) = p(X = x_i, Y = y_j \mid M_1 = A_1) \times p(X = x_i, Y = y_j \mid M_2 = A_2)$$  [eq 9]


From Bayes’ theorem we can write each term on the right-hand side of the equation as

$$p(X = x_i, Y = y_j \mid M_1 = A_1) = \frac{p(M_1 = A_1 \mid X = x_i, Y = y_j)\, p(X = x_i, Y = y_j)}{\sum_{i=1}^{m} \sum_{j=1}^{n} p(M_1 = A_1 \mid X = x_i, Y = y_j)\, p(X = x_i, Y = y_j)}$$

$$p(X = x_i, Y = y_j \mid M_2 = A_2) = \frac{p(M_2 = A_2 \mid X = x_i, Y = y_j)\, p(X = x_i, Y = y_j)}{\sum_{i=1}^{m} \sum_{j=1}^{n} p(M_2 = A_2 \mid X = x_i, Y = y_j)\, p(X = x_i, Y = y_j)}$$  [eq 10]


The term $p(X = x_i, Y = y_j)$ is the a priori probability associated with the target being
in location $(x_i, y_j)$, and $p(X = x_i, Y = y_j \mid M_1 = A_1)$ is the posterior probability after the
measurement $A_1$ was obtained. Assuming a model for the amplitude that is a function of
the target location,

$$A = M_A(X, Y, \theta) + \varepsilon_A$$  [eq 11]


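In equation 11, $\theta$ is a sensor-specific parameter vector and $\varepsilon_A$ is the measurement error. If
$\varepsilon_A$ is assumed to be normally distributed with a standard deviation $\sigma_A$, then

$$p(X = x_i, Y = y_j \mid M_1 = A_1) \propto \frac{1}{\sqrt{2\pi}\,\sigma_A} \exp\left(-\frac{\left(M_A(x_i, y_j, \theta) - A_1\right)^2}{2\sigma_A^2}\right) p(X = x_i, Y = y_j)$$  [eq 12]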


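Using equation 12, the posterior probability distribution of the target location given
measurements from N different sensors can be obtained. If the measurement error for each
sensor is assumed to be normal, the equation above becomes

$$p(X = x_i, Y = y_j \mid M_1 = A_1, \ldots, M_N = A_N) \propto \frac{1}{(2\pi)^{N/2} \prod_{k=1}^{N} \sigma_{k,A}} \exp\left(-\sum_{k=1}^{N} \frac{\left(M_{k,A}(x_i, y_j, \theta_k) - A_k\right)^2}{2\sigma_{k,A}^2}\right)$$

$$P(X = x_i, Y = y_j \mid M_1 = A_1, \ldots, M_N = A_N) = C \exp\left(-\sum_{k=1}^{N} \frac{\left(M_{k,A}(x_i, y_j, \theta_k) - A_k\right)^2}{2\sigma_{k,A}^2}\right)$$  [eq 13]

where $\sigma_{k,A}^2$ is the variance in amplitude measurements, $\theta_k$ is the parameter vector,
$M_{k,A}(x_i, y_j, \theta_k)$ is the amplitude model of the sensor, $A_k$ is the measurement received
from that sensor, and $C$ is a normalization constant.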


An  example  that  illustrates  the  fusion  of  two  amplitude  measurements  to  obtain  a  probability 
distribution  of  target  location  is  shown  in  Figure  3. 

Clearly,  it  is  not  possible  to  track  a  target  with  just  one  or  two  amplitude  measurements;  the 
region  of  high  probability  is  too  wide.  To  resolve  this  issue,  either  more  measurements  are 
needed,  or  a  motion  model  that  could  resolve  the  ambiguities  is  needed.  Ideally,  the  motion 
model  would  use  the  history  of  target  states  to  estimate  velocity. 
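
The grid-based fusion of amplitude measurements can be sketched as follows. This is an illustrative example under stated assumptions, not the tracker's implementation: `amplitude_model` is a hypothetical stand-in (amplitude falling off with range) rather than the report's sensor model, and the function names, argument layout, and default uniform prior are choices made for the example.

```python
import math


def amplitude_model(x, y, sensor):
    """Hypothetical stand-in for the amplitude model of one sensor."""
    sx, sy, gain = sensor
    r = math.hypot(x - sx, y - sy)
    return gain / (1.0 + r * r)


def fuse_amplitudes(cells, sensors, amplitudes, sigmas, prior=None):
    """Posterior over grid cells given one amplitude reading per sensor.

    cells      : list of (x, y) cell centers
    sensors    : list of (sx, sy, gain) tuples, one per reading (assumed layout)
    amplitudes : measured amplitudes A_k
    sigmas     : assumed standard deviations of the amplitude errors
    prior      : optional prior per cell; uniform if omitted (an assumption)
    """
    if prior is None:
        prior = [1.0 / len(cells)] * len(cells)
    log_post = []
    for (x, y), p0 in zip(cells, prior):
        exponent = 0.0
        for sensor, a_k, s_k in zip(sensors, amplitudes, sigmas):
            residual = amplitude_model(x, y, sensor) - a_k
            exponent -= residual * residual / (2.0 * s_k * s_k)
        log_prior = math.log(p0) if p0 > 0.0 else float("-inf")
        log_post.append(exponent + log_prior)
    # Normalize in log space for numerical stability; this plays the role
    # of the normalization constant C.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

Working with the exponent directly and normalizing at the end avoids underflow when many sensors are fused. With only one or two sensors the resulting distribution is typically spread along arcs of constant range, which is the ambiguity noted in the text.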


7.  FREQUENCY  HANDLER 


A frequency model for the sensor relates the velocity of the target to the sensor’s frequency
measurement. It is assumed that frequency is a linear function of the radial velocity with
respect to the sensor, i.e.,

$$f_k = C_{k,f}\, \dot{R}_k$$  [eq 14]

where $f_k$ is the frequency measurement, $C_{k,f}$ is a constant, and $\dot{R}_k$ is the radial velocity with
respect to sensor $k$.

A single frequency measurement is not sufficient to solve for the target’s velocity; measurements
from at least two sensors are required. Even with two measurements, a
technique to determine the radial direction (whether the target is moving towards or away from
the sensor) is required.


Figure 3: (a) Target Probability Distribution From a Single Amplitude Measurement

(b) Target Probability Distribution From Fusing Two Amplitude Measurements


Figure  4  shows  the  relation  between  the  sensor  location  and  the  target  velocity. 

$\dot{R}$: Radial velocity with respect to the sensor
$\alpha$: Angle between target and sensor
$v_x, v_y$: Velocity of the target in the x and y directions, respectively
$(x_s, y_s)$: Sensor coordinates
$(x_t, y_t)$: Target coordinates
$R$: Target distance from the sensor

Figure 4: Relation Between the Sensor Location and the Target Velocity

The frequency measurements from the sensors can be converted into radial velocities using
the equation for frequency. Let $\dot{R}_i$ denote the radial velocity of a target computed from the
frequency of the $i$-th sensor. The target velocity components $v_x, v_y$ are related to $\dot{R}_i$ as

$$v_x \cos\alpha_i + v_y \sin\alpha_i = \dot{R}_i$$  [eq 15]

where $\alpha_i$ denotes the angle between the sensor and target coordinates. The tracker
solves this equation simultaneously for $v_x$ and $v_y$ whenever frequency measurements are
obtained from more than two sensors within the same time frame.
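
One way to solve the radial-velocity equations simultaneously is ordinary least squares, sketched below. The report does not state which solution method the tracker uses, so least squares is an assumption here, as are the function name `solve_velocity` and its arguments. The sketch also assumes that the radial velocities are already signed, i.e., that the towards/away ambiguity has been resolved.

```python
import math


def solve_velocity(angles, radial_velocities):
    """Least-squares solution of v_x*cos(a_i) + v_y*sin(a_i) = Rdot_i.

    angles            : angle a_i (radians) between each sensor and the target
    radial_velocities : signed radial velocity Rdot_i seen by each sensor
    """
    scc = scs = sss = bc = bs = 0.0
    for a, r in zip(angles, radial_velocities):
        c, s = math.cos(a), math.sin(a)
        scc += c * c
        scs += c * s
        sss += s * s
        bc += c * r
        bs += s * r
    # Solve the 2x2 normal equations [[scc, scs], [scs, sss]] [vx, vy] = [bc, bs].
    det = scc * sss - scs * scs
    if abs(det) < 1e-12:
        raise ValueError("degenerate geometry: sensors must view the target "
                         "from different directions")
    vx = (sss * bc - scs * bs) / det
    vy = (scc * bs - scs * bc) / det
    return vx, vy
```

With exactly two sensors at distinct angles the system is square and the result is the exact solution; with more sensors the extra equations average out measurement noise.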


8.  MOTION  MODEL 

A  motion  model  was  developed  using  recent  history  of  target  states  [14,  18,  20].  This  can  be 
viewed  as  a  learning  technique  of  the  target’s  motion,  based  on  the  assumption  that  the 
target  is  subjected  to  the  laws  of  inertia.  This  assumption  is  consistent  with  the  CP;  the 
targets being detected have a constrained pattern of motion. Other “common-sense”
constraints, like the fact that the target cannot change its direction abruptly, or
cannot move faster than a pre-specified maximum velocity, are also assumed. Some
of the advantages of the motion model are:

•  Since  the  motion  model  captures  the  velocity  distribution,  using  the  model  in 
conjunction  with  the  amplitude  handler  results  in  more  accurate  distributions  of 
target  location.  As  discussed  in  a  previous  section,  the  amplitude  handler  sometimes 
gives  two  clusters  as  the  most  likely  target  locations.  Using  a  probabilistic  weighted 
mean  in  such  circumstances  to  predict  target  location  may  lead  to  erroneous 
estimations. Multi-modal distributions of target location can be avoided most of the
time by using the motion model.

•  The  motion  model  can  help  to  predict  future  target  states  and  hence  optimize  the 
use  of  CP  resources  (sensors)  from  this  information. 

•  Estimates  of  target  location  can  be  obtained  even  when  the  sensors  fail  to  report 
measurements. 


The  motion  model  builds  a  probabilistic  estimate  from  previous  target  states.  There  are  two 
techniques  to  compute  target  velocity.  One  uses  the  previous  target  locations  to  find  the 
fraction  of  distance  that  the  target  has  moved  in  time  t.  The  second  uses  frequency 
measurements  from  the  sensors  to  compute  the  target  velocity  [14,  18,  20]. 

A  fixed  number  of  time  frames  in  the  past  are  used  to  build  the  motion  model. 


The  velocity  of  the  target  has  two  components,  the  magnitude  and  the  direction  of  motion. 
Hence  the  model  is  split  into  two  parts,  one  for  the  distance  model  and  one  for  the  direction 
model.  An  average  value  of  the  speed  and  the  direction  can  be  computed  from  the  previous 
time  frames. 


Let $v$ and $\beta$ be the average speed and direction computed from $N_f$ time frames. Let $(x_p, y_p)$
be the previous target location estimated at time $t_p$. Assume that the amplitude model gives
the target location distribution from the amplitude measurements. The distribution is
updated using the distance and the direction models.

A new target location $(x_c, y_c)$ at time $t_c$ is computed using

$$x_c = x_p + (v \cos\beta)(t_c - t_p)$$
$$y_c = y_p + (v \sin\beta)(t_c - t_p)$$  [eq 16]

Distance Model: The target domain is divided into m × n grid cells. Let $(x_i, y_j)$
represent a cell in the grid. If $d_{ij}$ is the distance between the points $(x_i, y_j)$ and
$(x_c, y_c)$, the probability of a target being in cell $(x_i, y_j)$ is computed using a
Gaussian model

$$p(x_i, y_j \mid H(t_p)) \propto \frac{1}{\sqrt{2\pi}\,\sigma_d} \exp\left(-\frac{d_{ij}^2}{2\sigma_d^2}\right)$$  [eq 17]

where $H(t_p)$ is the target history considered and $\sigma_d$ is the standard deviation
assumed for the distance model.

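Direction Model: Let $\beta_{ij}$ denote the angle between $(x_i, y_j)$ and $(x_p, y_p)$. Assume a
standard deviation $\sigma_\beta$ in the direction estimate $\beta$. The probability estimate for the
location $(x_i, y_j)$ using the direction model is given by

$$p(x_i, y_j \mid H(t_p)) \propto \frac{1}{\sqrt{2\pi}\,\sigma_\beta} \exp\left(-\frac{(\beta_{ij} - \beta)^2}{2\sigma_\beta^2}\right)$$  [eq 18]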


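The joint distribution of the distance and direction models gives the target location
distribution given the history $H(t_p)$, expressed as:

$$P(x_i, y_j \mid H(t_p)) \propto \exp\left(-\left(\frac{d_{ij}^2}{2\sigma_d^2} + \frac{(\beta_{ij} - \beta)^2}{2\sigma_\beta^2}\right)\right)$$  [eq 19]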
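
The motion model can be illustrated with a short sketch that projects the previous location forward and then weights every grid cell by the joint distance and direction model. The function names, the argument layout, and the wrapping of angle differences into [-pi, pi) are assumptions made for this example; the report does not say how angle differences are handled.

```python
import math


def predict_location(prev, speed, heading, dt):
    """Project the previous location forward along the average heading."""
    xp, yp = prev
    return (xp + speed * math.cos(heading) * dt,
            yp + speed * math.sin(heading) * dt)


def angle_difference(a, b):
    """Signed difference a - b wrapped to [-pi, pi) (an added assumption)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def motion_prior(cells, prev, speed, heading, dt, sigma_d, sigma_b):
    """Normalized weight of each cell under the joint distance/direction model."""
    xc, yc = predict_location(prev, speed, heading, dt)
    weights = []
    for x, y in cells:
        # Distance from the cell to the predicted location.
        d = math.hypot(x - xc, y - yc)
        # Direction from the previous location to the cell.
        cell_heading = math.atan2(y - prev[1], x - prev[0])
        b = angle_difference(cell_heading, heading)
        weights.append(math.exp(-(d * d) / (2.0 * sigma_d ** 2)
                                - (b * b) / (2.0 * sigma_b ** 2)))
    total = sum(weights)
    return [w / total for w in weights]
```

For a target last seen at the origin and moving along the x axis at unit speed, the cell at (2, 0) receives the largest weight after two time units. The two standard deviations control how strongly the model penalizes deviations in distance and in direction.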


9.  TARGET  LOCATION 

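A joint probability distribution for the amplitude measurements and the motion model is computed
to estimate the target’s location. Combining the equations of the amplitude model and the motion
model we obtain the joint probability distribution

$$p(X = x_i, Y = y_j \mid A_1, \ldots, A_N, H(t_p)) \propto \exp\left(-\left(\sum_{k=1}^{N} \frac{\left(M_{k,A}(x_i, y_j, \theta_k) - A_k\right)^2}{2\sigma_{k,A}^2} + \frac{d_{ij}^2}{2\sigma_d^2} + \frac{(\beta_{ij} - \beta)^2}{2\sigma_\beta^2}\right)\right)$$  [eq 20]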


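The target location $(x_t, y_t)$ is given by

$$x_t = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} x_i\, p(x_i, y_j \mid A_1, \ldots, A_N, H(t_p))}{\sum_{i=1}^{m} \sum_{j=1}^{n} p(x_i, y_j \mid A_1, \ldots, A_N, H(t_p))}, \qquad y_t = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} y_j\, p(x_i, y_j \mid A_1, \ldots, A_N, H(t_p))}{\sum_{i=1}^{m} \sum_{j=1}^{n} p(x_i, y_j \mid A_1, \ldots, A_N, H(t_p))}$$  [eq 21]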
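
The probability-weighted mean over the grid can be computed as in the following sketch. The function name `weighted_mean_location` and the flat list-of-cells layout are assumptions made for the example.

```python
def weighted_mean_location(cells, probabilities):
    """Probability-weighted mean of the cell centers.

    cells         : list of (x, y) cell centers
    probabilities : weight of each cell; need not be normalized
    """
    total = sum(probabilities)
    if total <= 0.0:
        raise ValueError("probabilities must have positive total mass")
    x = sum(cx * p for (cx, _), p in zip(cells, probabilities)) / total
    y = sum(cy * p for (_, cy), p in zip(cells, probabilities)) / total
    return x, y
```

Dividing by the total mass means the weights do not have to be normalized beforehand. As noted in the discussion of the motion model, a weighted mean is only a sensible estimate when the distribution has a single dominant cluster.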


Figure 5 illustrates the joint distribution of the distance and direction models.

Figure 5: Motion model: (a) Distance Probability Model

(b) Angle Probability Model

(c) Joint Probability Distribution From Distance and Angle Probability

(d) Contour Plot of the Joint Probability Distribution

10. RESULTS


The tracker was tested using several target configurations. Figure 6 shows a typical set of
results, obtained using the tracker under an “omniscient” controller. Since this controller
knew where the target was, it instructed the tracker to get measurements from
the “right sectors,” providing a means to establish a baseline estimate of the tracker’s
performance. Even under these conditions, the measurements were
not assured to arrive at the right time. The diamond-shaped dots are the tracker’s predictions
of target location. The square dots are the true locations. As the figure shows, there is an insignificant
difference between the two sets of data. The Root Mean Square (RMS) error was 0.4 ft.


Figure 6: Results Obtained with the SC Tracker During the First Demonstration
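
An RMS error of this kind can be computed as sketched below. The report does not define how the error was calculated, so this example assumes it is the root of the mean squared Euclidean distance between predicted and true locations; the function name `rms_error` is hypothetical.

```python
import math


def rms_error(predicted, actual):
    """Root mean square of the Euclidean distances between paired points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(px - ax) ** 2 + (py - ay) ** 2
               for (px, py), (ax, ay) in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

The result is in the same units as the coordinates, so locations expressed in feet yield an error in feet.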


11.  MULTIPLE  TARGET  TRACKING 

The multiple-target version was derived from the original single-target version. In essence,
instead of maintaining target state for a single target, the multi-target version maintained
state for N targets. Since the allocation of sensor resources was decided by the controllers,
the association of targets to sensors was not considered in the multi-target version.

The  final  version  of  the  multiple  target  tracker  was  further  adapted  so  that  instead  of  keeping 
multiple  states  within  the  tracker,  multiple  instances  of  the  tracker  were  spawned  by  their 
controller agents. The resulting architecture is that shown in Figure 7. Since the controller
agent was the piece of software in which negotiation took place, the controller
determined which resources were needed, and based on those decisions, sensor
measurements were fed into the appropriate instance of a tracker thread, where
predictions of target state were made and communicated to the controller.

As  in  previous  demos,  the  performance  of  the  tracker  was  quite  acceptable,  as  shown  in 
Figure  7. 


Figure 7: Results Obtained with the SC Tracker During the Final Demonstration

12. CONCLUSIONS

We  have  presented  a  multiple-target  tracking  system  that  uses  probabilistic  data  fusion  to 
estimate  target  location  and  velocity  from  sensor  data.  The  system’s  architecture  provides  a 
high  degree  of  separation  between  the  data  sources,  the  tracking  logic,  and  the  state 
transitions,  which  could  be  adapted  to  other  tracking  and  data  fusion  scenarios. 

The  two  major  components  of  the  tracker,  the  state  model  and  the  process  model,  are 
based on Bayesian estimation theory. This makes it possible to fuse data from different
measurement spaces into a rich, unified representation scheme, and offers the following
advantages:

•  Robust  operational  behavior:  Multi-sensor  data  fusion  has  an  increased  robustness 
when  compared  to  single  sensor  data  fusion.  When  one  sensor  becomes  unavailable 
or  is  inoperative,  other  sensors  can  provide  information  about  the  environment. 

•  Extended  spatial  and  temporal  coverage:  Some  parts  of  the  environment  may  not  be 
accessible  to  some  sensors  due  to  range  limitations.  This  occurs  especially  when  the 
environment  being  scanned  is  vast.  In  such  scenarios,  multiple  sensors  that  are 
mounted  at  different  locations  can  maximize  the  regions  of  scanning.  Multi-sensor 
data  fusion  provides  increased  temporal  coverage  as  some  sensors  can  provide 
information  when  others  cannot. 

• Increased confidence: The confidence in detection of targets is increased in multi-sensor
data fusion. A single target location can be confirmed by more than one sensor,
and this increases the user’s confidence in target detection.

•  Reduced  ambiguity:  Joint  information  from  multiple  sensors  can  reduce  the  set  of 
beliefs  about  the  target. 

• Decreased costs: Multiple inexpensive sensors can replace an expensive single-sensor
architecture at a significant reduction in cost.

• Improved detection: Integrating measurements from multiple sensors can improve the
signal-to-noise ratio, which improves detection.